Visual Relationship Detection with Language Priors

Authors

  • Cewu Lu
  • Ranjay Krishna
  • Michael S. Bernstein
  • Li Fei-Fei
Abstract

Visual relationships capture a wide variety of interactions between pairs of objects in images (e.g., “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible relationships is extremely large, and it is difficult to obtain sufficient training examples for all of them. Because of this limitation, previous work on visual relationship detection has concentrated on predicting only a handful of relationships. Though most relationships are infrequent, their objects (e.g., “man” and “bicycle”) and predicates (e.g., “riding” and “pushing”) independently occur more frequently. We propose a model that uses this insight to train visual models for objects and predicates individually and later combines them to predict multiple relationships per image. We improve on prior work by leveraging language priors from semantic word embeddings to fine-tune the likelihood of a predicted relationship. Our model can scale to predict thousands of types of relationships from a few examples. Additionally, we localize the objects in the predicted relationships as bounding boxes in the image. We further demonstrate that understanding relationships can improve content-based image retrieval.
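The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea under stated assumptions: object and predicate confidences come from separately trained visual models, and a language score computed from word vectors rescales their product. The toy embeddings, the weights for "riding", and the function names `language_prior` and `relationship_score` are all hypothetical and are not taken from the paper.

```python
import math

# Toy 3-d word vectors: made-up stand-ins for real semantic word embeddings.
TOY_EMBEDDINGS = {
    "man": [0.9, 0.1, 0.0],
    "bicycle": [0.1, 0.9, 0.0],
    "sky": [0.0, 0.1, 0.9],
}

# Hypothetical per-predicate parameters for "riding" (assumed, not learned here).
RIDING_WEIGHTS = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
RIDING_BIAS = -1.0


def language_prior(subject, obj, weights, bias):
    """Score a predicate from the concatenated subject/object word vectors."""
    features = TOY_EMBEDDINGS[subject] + TOY_EMBEDDINGS[obj]
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # squash the score into (0, 1)


def relationship_score(p_subject, p_predicate, p_object, prior):
    """Combine independently produced visual scores with the language prior."""
    return p_subject * p_predicate * p_object * prior


# Identical (assumed) visual confidences, so only the language prior differs.
score_bicycle = relationship_score(
    0.9, 0.8, 0.9, language_prior("man", "bicycle", RIDING_WEIGHTS, RIDING_BIAS)
)
score_sky = relationship_score(
    0.9, 0.8, 0.9, language_prior("man", "sky", RIDING_WEIGHTS, RIDING_BIAS)
)
```

With these toy numbers, "man riding bicycle" ranks above "man riding sky" even though the visual confidences are the same, which is the kind of correction a language prior is meant to supply.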

Similar resources

Generating Image Descriptions with Gold Standard Visual Inputs: Motivation, Evaluation and Baselines

In this paper, we present the task of generating image descriptions with gold standard visual detections as input, rather than directly from an image. This allows the Natural Language Generation community to focus on the text generation process, rather than dealing with the noise and complications arising from the visual detection process. We propose a fine-grained evaluation metric specificall...

Comparative Approach to the Relationship Between Text and Hand Visual Language in Tahmasebi’s Shahnameh Pictures

The painters of Tahmasbi Shahnameh, in order to depict the text full of the story of Shahnameh, tried to convey emotions and excitement to the audience by using the visual language of the hand. Due to the multiplicity of applications of this type of nonverbal communication in different situations, the painter may have undergone changes in parts of her painting under the influence of various fac...

Ambiguities and conventions in the perception of visual art

Vision perception is ambiguous and visual arts play with these ambiguities. While perceptual ambiguities are resolved with prior constraints, artistic ambiguities are resolved by conventions. Is there a relationship between priors and conventions? This review surveys recent work related to these ambiguities in composition, spatial scale, illumination and color, three-dimensional layout, shape, ...

A Channel-Based Perspective on Conjugate Priors

A desired closure property in Bayesian probability is that an updated posterior distribution is in the same class of distributions — say Gaussians — as the prior distribution. When the updating takes place via a likelihood, one then calls the class of prior distributions the ‘conjugate prior’ of this likelihood. This paper gives (1) an abstract formulation of this notion of conjugate prior, usi...

Multiple Intelligence and EFL Learners' Reading Comprehension

The second half of the twentieth century can be called the age of individualization when individual values and differences are recognized and respected. Intelligence is among the various aspects of individual differences which affect education and language learning. As such, the present study aimed at investigating the relationship between Multiple Intelligence and Reading Comprehension Abiliti...

Publication date: 2016